perm filename TOPIC.1[AP,DBL] blob sn#116987 filedate 1974-08-26 generic text, type T, neo UTF8
00100	This is a preliminary sketch of an idea for an AI system which possesses 
00200	Mathematical Intuition and is able to find proofs and counterexamples by
00300	manipulating data structures and little programs which it itself has 
00400	written.  
00500	
00600	The system would "read through" a math text, e.g. one on Topology, and as
00700	each new concept is defined, incorporate it into its expertise by composing
00800	data structures, programs to operate on the data structures, etc.   The idea
00900	of intuition appearing  derives from the fact that these little models may
01000	be informal, based on real-world analogies, for example.  Thus the system
01100	must have a firm grounding in physical manipulations and structures, as well
01200	as in arithmetic, set theory, and what ever else the chosen text prerequires
01300	of its readers.
01400	
01500	Thought must be given to what level of text to choose, and what level of
01600	mastery to strive for, and whether the book may be recoded into a simpler
01700	internalizable format by hand and then read in.  A high-level field like
01800	topology has its attractiveness in that (i) very few people EVER are able
01900	to build up an intuition in this domain, and (ii) this may be caused by
02000	bookkeeping  strains beyond the capacity of our STM and its associated
02100	LTM retrival system; the field is simply a massive set of definitions
02200	and the theorems are simply relations between these definitons; the more
02300	apparent the relation, the less startling the theorem.
02400	
02500	Intertwined with all these decisions are the key ones: what exactly does the
02600	system build up, and how do these pieces interact when confronted by a
02700	proposition whose truth we must intuit and then try to prove/disprove?
02800	Let me show a partial answer, again for topology. Suppose we have come to
02900	a new property's definition, what do we do? Humans would keep reading on,
03000	reread the definition, reread definitions of words mentioned in the defini-
03100	tion, skip ahead to see how and how often the property is mentioned, think 
03200	about spaces which do and don't have this property, etc.
03300	The system should be concerned with gleaning similar information:
03400		FORMAL   various alternate forms of the definition
03500		GEOMETRIC  get examples of spaces which do/don't have the property,
03600	specifically: simplest, typical, barely, simplifying regularities
03700		OPERATIONS  how to manipualte the geometric and formal models
03800	to show space X has property P, to show space X does not have property P,
03900	for each forseeable way in which P(X) could fail to hold   provide a hint
04000	as to how to alter X to overcome this flaw,  ways to change a space X
04100	to ensure that P(X) doesn't hold.
04200		LINKS    how is P related to other properties (Q, R, ...)
04300	P → Q,   P ← Q,   P ≡ Q,   P∧Q → R,   etc.
04400	what proof techniques will be useful in proving theroems involving P? why?
04500	
04600	Of course, this discussion is still far too nebulous to hand to a programmer
04700	and say "do it!"   Serious consideration must be paid to the details of the
04800	interactions between the pieces when confronted by a new statement.  In a
04900	domain where even the humans have poor intuition, our poor system may suffer
05000	from no one to rear it properly!  So perhaps an elementary school level text
05100	is more desirable, albeit less impressive.
05200	
05300	The hope is that the system will read a new statement like 'any X space
05400	which is Y is also Z' and use its models of the three properties to convince
05500	itself that this is reasonable, or else to decide it is startling. The first
05600	case should point to a plan for a proof, while the second should point to
05700	possible counterexamples. If these latter turn out to satisfy the statement,
05800	after all, then the models should be adjusted until this seems reasonable.
05900	This reconciliation should itself help guide us toward a proof. In a similar
06000	way, a careful examination of a Gelernter model could suggest useful facts
06100	to establish along the road to a geometry proof (lemmas about equal angles
06200	or segments induced from analytic geometric measurements.) This is the state
06300	of the idea as of 7/16/74. I have talked with some members of the AP group
06400	who have suggested how convexity and openness might be represented, and how
06500	they might interact in proving that the union of disjoint open sets is not
06600	convex.
     

00100	
00200		The type of program I am looking for, to be a diss. target program,
00300	must be v. long (beyond human abilities, at least current existence)
00400	and ideally nicely separable into small pieces (of only several types).
00500	Why write a program in this way; why not write a shorter but slightly
00600	more intricate prog. to begin with (i.e., why write a loop unrolled,
00700	etc.)?
00800		One reason could be efficiency, but I think that AP is not then
00900	the right way to go: this would just be a good compiler.
01000		Another reason could be that we build up the program incrementally,
01100	over time and perhaps over independant workers, thus no one person at
01200	any one time can write the whole program; i.e., we have some pieces only
01300	at each instant, and yet we must use the program as well as fill it out.
01400	This is like the topol. system just described.
01500		A related thought, perhap a new proposal itself, is that the
01600	program might bootstrap itself: we might want it to be able to go thru
01700	Knuth, using its current knowledge to interpret the new programming
01800	knowledge into a usable form, and periodically doing the exercises,
01900	writing bigger and better programs, solving harder problems, etc.
02000	This is a little like studying English in English; it is how we
02100	all learned to program well, how we all learned Math, etc.
02200	Note that the format of inputting the new info. must be different
02300	from the final usable internal form, or else that part of the
02400	system is nonexistent; this might be a nice separation of the
02500	project: assimilating new knowledge; using current knowledge to
02600	do the exercises.  Notice that the latter task also implies that the
02700	exercise statemnt has been understood in some sense.
02800	In PUP6, the translation is entirely separate: it is done via a
02900	production system assembled from each being's IDEN part, a simple hack.
03000	It is the usage, the synthesis, which is emphasized, which is intelligent.
03100		As Knuth says himself, chap. 1 is mainly for reference; it should
03200	be memorized without too much attention to reasonable intelligence.
03300	Chap. 2 may be crucial, may be THE chapter to work on: it teaches
03400	programming (also include sec. 1.4, intro. programming. techniques in MIX).
03500	Perhaps a different source of new knowledge would be better, one geared
03600	specifically to AI programming techniques in LISP; prhaps even just
03700	the INTERLISP manual.   Example: we learn about the function TCONC
03800	and (i) set up a new being who knows when to use it, (ii) go thru the
03900	existing code -- both the system and the partial target -- and replace
04000	some of the appends and nconcs with tconcs, if that is better.
04100		Still another idea (new proposal?) would be involved with
04200	Computer-Aided Instruction: either the generation of such teaching
04300	programs, or merely being such a program (interacting with the learner),
04400	similar to the computer consultant task described in the speech report.
04500	A third alternative would be that of the learner: get a program which can
04600	test out and learn from CAI systems!
04700		Relating more to pup6 again, we must ask whether the system
04800	should be allowed to make buggy programs; if so, it should (unlike pup6)
04900	have a vast amount of info for analyzing the bugs (detecting bugs
05000	syntactically (by examination of the code), semantically (by receiving
05100	and processing breaks during runtime), logically (by understanding
05200	what should be happening vs what is really happening), and pragmatically
05300	(by having some estimate as to how costly each phase should be vs is)).
05400	Also, should it synthesize from scratch, or modify, or both?  If it 
05500	modifes: just its own programs, or specially commented, or specially
05600	written (eg, structured), or supplemented with semantic explanations, etc.?
05700	The role of system learning vs system performing must be made clear.
05800	That is, what is the system supposed to do, what task or problem should
05900	it handle, and secondly: how does it expand itself, its knowledge
06000	(eg: at least: understand the task posed; at most: modify all its own
06100	code when new relevant info is learned).